Problem Statement:¶
In the food industry, identifying food items quickly and accurately is essential for applications such as automated inventory management, calorie estimation, restaurant automation, and dietary monitoring. Manual identification is time-consuming, error-prone, and not scalable. Thus, there is a need for an automated, intelligent system that can classify food items from images with high accuracy.
Context:¶
In the era of digital transformation, automated food detection using computer vision has become increasingly important in various sectors such as hospitality, healthcare, fitness, retail, and food delivery. Accurate identification of food items from images enables intelligent systems to recognize what a person is eating, streamline restaurant operations, or even automate checkout processes in cafeterias.
For example, in a smart cafeteria, cameras can detect and identify food items on a tray without manual input, enabling a frictionless billing experience. In diet and nutrition apps, users can take a picture of their meal, and the app can instantly classify the food and estimate nutritional content. In quality assurance for food production, automated systems can detect if the right type of food is being processed or if items are visually defective.
Such applications demand a robust food classification model capable of identifying food items from images with high accuracy, regardless of variations in presentation, lighting, or camera angles. This project aims to tackle this challenge by leveraging deep learning techniques to train a model that can automatically detect and classify different types of food from a diverse dataset of labeled food images.
Data Descriptions:¶
The project uses a curated subset of the Food-101 dataset, a widely used benchmark for food classification tasks. This dataset includes:
500 images categorized into
10 distinct food classes (e.g., apple_pie, fried_rice, sushi)
Each class contains a balanced distribution of training and test images, generally split in a 70-30 ratio
Images vary in lighting, background, and angle to mimic real-world food photography conditions
Each image is labeled with the corresponding food class, enabling supervised learning approaches to be applied effectively.
Project Objective¶
The primary goal of this project is to:
Develop a deep learning-based food identification model that can accurately classify food items from images.
Key objectives include:
Building a convolutional neural network (CNN) model to classify food images into one of the 10 defined categories
Evaluating model performance using standard metrics such as accuracy, precision, recall and confusion matrix.
Enabling a potential real-time application where the trained model can be integrated into camera-based systems for smart kitchens, restaurant automation, or diet-tracking apps
Ultimately, this solution aims to demonstrate the feasibility of intelligent, camera-driven food recognition systems, contributing toward innovations in food technology and AI-driven lifestyle tools.
Step 1: Import the data¶
Importing Required Libraries¶
import os # File and directory operations
import pandas as pd # Data handling
import matplotlib.pyplot as plt # Plotting
import matplotlib.patches as patches # Drawing shapes on plots
import cv2 # Image processing
import numpy as np
Unzipping the Food-101 Dataset¶
# Define the path to the ZIP file containing the dataset
zip_path = 'Food_101.zip'
# Define the directory where the ZIP file should be extracted
extract_to = 'food101_data'
import zipfile # Importing the zipfile module to handle ZIP archives
# Open the ZIP file in read mode ('r') using a context manager
with zipfile.ZipFile(zip_path, 'r') as zip_ref:
# Extract all contents of the ZIP file to the specified directory
zip_ref.extractall(extract_to)
# Print confirmation message after extraction is complete
print("Dataset unzipped!")
Dataset unzipped!
Exploratory Data Analysis¶
Verify Directory Structure¶
# List all files and directories in the specified path 'extract_to'
# 'extract_to' should be a variable that holds the path where your dataset was extracted
os.listdir(extract_to)
['.DS_Store', '__MACOSX', 'Food_101']
List classes¶
# Join the extraction directory with the 'Food_101' folder to get the full path
food101_dir = os.path.join(extract_to, 'Food_101')
# List all files and subdirectories in the 'Food_101' folder
# This will typically include folders like 'images' and files like 'meta'
os.listdir(food101_dir)
['ice_cream', 'samosa', 'donuts', '.DS_Store', 'waffles', 'falafel', 'ravioli', 'strawberry_shortcake', 'spring_rolls', 'hot_dog', 'apple_pie', 'chocolate_cake', 'tacos', 'pancakes', 'pizza', 'nachos', 'french_fries', 'onion_rings']
base_path = 'food101_data/Food_101/' # path to class folders
class_to_images = {}
for cls_name in os.listdir(base_path):
cls_folder = os.path.join(base_path, cls_name)
if os.path.isdir(cls_folder):
image_files = os.listdir(cls_folder)
class_to_images[cls_name] = image_files
# Summary
total_images = sum(len(v) for v in class_to_images.values())
print(f"Total classes: {len(class_to_images)}")
print(f"Total images: {total_images}")
Total classes: 17 Total images: 16257
for i, (cls, imgs) in enumerate(class_to_images.items()):
print(f"{cls}: {len(imgs)} images")
ice_cream: 1000 images samosa: 1000 images donuts: 1000 images waffles: 1000 images falafel: 1000 images ravioli: 1000 images strawberry_shortcake: 1000 images spring_rolls: 1000 images hot_dog: 1000 images apple_pie: 257 images chocolate_cake: 1000 images tacos: 1000 images pancakes: 1000 images pizza: 1000 images nachos: 1000 images french_fries: 1000 images onion_rings: 1000 images
Observation :
- Total Classes: There are 17 different food categories in your current dataset.
- Total Images: There are a total of 16,257 food images available.
- Uniformity: Most classes (like pizza, donuts, pancakes, etc.) have 1,000 images each, showing good class balance.
- Exception: Only one class, apple_pie, has fewer images (257 only) — this may cause imbalance in training.
- This dataset is suitable for multi-class image classification, and can also be extended to object detection if bounding boxes are added.
Class Distribution Plot¶
# 1. Class Distribution Plot
classes = list(class_to_images.keys())
counts = [len(imgs) for imgs in class_to_images.values()]
plt.figure(figsize=(12, 6))
plt.bar(classes, counts, color='skyblue')
plt.xticks(rotation=45, ha='right')
plt.xlabel('Food Classes')
plt.ylabel('Number of Images')
plt.title('Number of Images per Food Class')
plt.show()
Observation:
- Most classes contain exactly 1,000 images, which is ideal for training.
- Only one class (
apple_pie) has significantly fewer images (257) — this may lead to class imbalance during training. - Dataset is well-suited for image classification tasks.
Image Size Analysis (width and height)¶
# Image Size Analysis (width and height)
import random
from PIL import Image
widths, heights = [], []
for cls, images in class_to_images.items():
sample_images = random.sample(images, min(20, len(images))) # sample 20 images per class
for img_name in sample_images:
img_path = os.path.join(base_path, cls, img_name)
with Image.open(img_path) as img:
w, h = img.size
widths.append(w)
heights.append(h)
plt.figure(figsize=(12, 5))
plt.subplot(1, 2, 1)
plt.hist(widths, bins=30, color='salmon', edgecolor='black')
plt.title('Distribution of Image Widths')
plt.xlabel('Width (pixels)')
plt.ylabel('Count')
plt.subplot(1, 2, 2)
plt.hist(heights, bins=30, color='lightgreen', edgecolor='black')
plt.title('Distribution of Image Heights')
plt.xlabel('Height (pixels)')
plt.ylabel('Count')
plt.tight_layout()
plt.show()
Image Size Distribution Observation
Most images are 512x512 pixels.
- This indicates the dataset is already quite standardized.
A few images have smaller dimensions (e.g., 300, 350 pixels).
- These are outliers and occur rarely.
This consistency is useful for model training.
- We can resize all images to 512x512 or a smaller fixed size (like 224x224) for deep learning models.
No very large or very small images were found.
- This ensures minimal image distortion during preprocessing.
Visualize the data, showing one image per class from 101 classes¶
# Visualize the data, showing one image per class from 101 classes
# Path to dataset
data_dir = food101_dir # Assuming `food101_dir` is already defined
foods_sorted = sorted([
d for d in os.listdir(data_dir)
if os.path.isdir(os.path.join(data_dir, d))
])
# Total number of classes
num_classes = len(foods_sorted)
# Dynamically define grid size
cols = 6
rows = int(np.ceil(num_classes / cols))
# Create subplots
fig, ax = plt.subplots(rows, cols, figsize=(4 * cols, 4 * rows))
fig.suptitle("Showing one random image from each class", y=1.02, fontsize=24)
# Flatten axes for easier iteration (in case rows * cols > num_classes)
ax = ax.flatten()
for food_id, food_name in enumerate(foods_sorted):
food_images = os.listdir(os.path.join(data_dir, food_name))
random_img = np.random.choice(food_images)
img_path = os.path.join(data_dir, food_name, random_img)
img = plt.imread(img_path)
ax[food_id].imshow(img)
ax[food_id].set_title(food_name, pad=10)
ax[food_id].axis('off')
# Hide any extra axes if there are unused subplots
for i in range(num_classes, len(ax)):
ax[i].axis('off')
plt.tight_layout()
plt.subplots_adjust(top=0.93) # Leave room for suptitle
plt.show()
Step 2: Map training and testing images to its classes.¶
from sklearn.model_selection import train_test_split
# Adjust path as needed
base_path = 'food101_data/Food_101'
# Get class names from folder names
class_names = sorted([folder for folder in os.listdir(base_path) if os.path.isdir(os.path.join(base_path, folder))])
food_data = []
# Collect image path and class label
for label in class_names:
folder_path = os.path.join(base_path, label)
for img_file in os.listdir(folder_path):
if img_file.lower().endswith(('.jpg', '.jpeg', '.png')):
img_path = os.path.join(folder_path, img_file)
food_data.append((img_path, label))
# Create DataFrame
food_df = pd.DataFrame(food_data, columns=['image_path', 'label'])
# Split into train/test (80/20)
train_food_df, test_food_df = train_test_split(food_df, test_size=0.2, stratify=food_df['label'], random_state=42)
print("✅ Mapped images to classes.")
print(f"Train: {len(train_food_df)} images, Test: {len(test_food_df)} images")
train_food_df.head()
✅ Mapped images to classes. Train: 13004 images, Test: 3252 images
| image_path | label | |
|---|---|---|
| 2230 | food101_data/Food_101/donuts/2249805.jpg | donuts |
| 12195 | food101_data/Food_101/samosa/1145678.jpg | samosa |
| 13392 | food101_data/Food_101/strawberry_shortcake/225... | strawberry_shortcake |
| 13828 | food101_data/Food_101/strawberry_shortcake/354... | strawberry_shortcake |
| 10269 | food101_data/Food_101/ravioli/788592.jpg | ravioli |
food_df
| image_path | label | |
|---|---|---|
| 0 | food101_data/Food_101/apple_pie/2968812.jpg | apple_pie |
| 1 | food101_data/Food_101/apple_pie/3134347.jpg | apple_pie |
| 2 | food101_data/Food_101/apple_pie/3314985.jpg | apple_pie |
| 3 | food101_data/Food_101/apple_pie/3670548.jpg | apple_pie |
| 4 | food101_data/Food_101/apple_pie/3917257.jpg | apple_pie |
| ... | ... | ... |
| 16251 | food101_data/Food_101/waffles/764669.jpg | waffles |
| 16252 | food101_data/Food_101/waffles/113651.jpg | waffles |
| 16253 | food101_data/Food_101/waffles/2364175.jpg | waffles |
| 16254 | food101_data/Food_101/waffles/3844038.jpg | waffles |
| 16255 | food101_data/Food_101/waffles/1576252.jpg | waffles |
16256 rows × 2 columns
Step 3: Create annotations for training and testing images.¶
[Take any 10 foods(class) of your choice and select any 50 images inside each food and create the annotations manually. You can use any image annotation tool to get the coordinates.]
Image Annotation Overview:
To train a model for object detection (such as YOLO, SSD, or Faster R-CNN), we’ve created annotations for selected food classes. These annotations are saved in a CSV file and follow a structured format suitable for model training.
Annotation Task Details
We selected 10 food classes of our choice. wihch is
- French Fries
- Apple Pie
- Nachos
- Pizza
- Pancakes
- Tacos
- Chocolate Cake
- Hot Dog
- Onion Rings
- Spring Roll
For each food class, we manually annotated 50-60 images.
We used an image annotation tool(Roboflow) to mark bounding boxes (object locations).
The annotation data is saved in a file: Datasetv1/original_images/_annotations.csv
Annotation File Structure:
The CSV file contains the following columns:
| Column | Description |
|---|---|
filename |
Name of the image file (e.g., pizza_01.jpg) |
width |
Width of the image in pixels |
height |
Height of the image in pixels |
class |
Name of the object class (e.g., pizza, samosa, etc.) |
xmin |
X-coordinate of the top-left corner of the bounding box |
ymin |
Y-coordinate of the top-left corner of the bounding box |
xmax |
X-coordinate of the bottom-right corner of the bounding box |
ymax |
Y-coordinate of the bottom-right corner of the bounding box |
This format is commonly used in object detection datasets to describe the position and size of objects within each image.
File & Folder Paths:
Below are the paths used for image data and annotations:
Path to the annotation file
Datasetv1/original_images/_annotations.csvFolder containing the corresponding images
Datasetv1/original_images/
Step 4: Display images with bounding box we have created manually in the previous step.¶
# Path to the CSV file containing image annotations (e.g., bounding boxes, labels)
csv_path = 'Datasetv1/original_images/_annotations.csv'
# Path to the folder where the original images are stored
img_folder = 'Datasetv1/original_images/'
EDA On Annotated Data¶
# Load annotations
food_annotations_df = pd.read_csv(csv_path)
# Display the shape of the DataFrame to check the number of rows and columns
food_annotations_df.shape
(558, 8)
# Display the entire DataFrame to inspect the data including any new columns added
food_annotations_df
| filename | width | height | class | xmin | ymin | xmax | ymax | |
|---|---|---|---|---|---|---|---|---|
| 0 | 2909830_jpg.rf.bb9125215f38f22139f72d04f19e693... | 512 | 512 | Apple Pie | 210 | 43 | 397 | 259 |
| 1 | 108743_jpg.rf.260978b4f8ae78f4ebb41f48ef501679... | 512 | 384 | French Fries | 50 | 3 | 442 | 383 |
| 2 | 149278_jpg.rf.86187fd5bd1698133cb7a973c6060449... | 512 | 384 | French Fries | 33 | 0 | 260 | 167 |
| 3 | 2986199_jpg.rf.ac0b99e71100520e6608ef72b12ee27... | 512 | 512 | Apple Pie | 28 | 37 | 291 | 233 |
| 4 | 2934928_jpg.rf.c8f427a0d3e7ba9342fe37276fb15ab... | 512 | 512 | Apple Pie | 9 | 54 | 463 | 465 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 553 | 30292-hotdog_jpg.rf.0390f5521fb9e6e7e3acb2a6a8... | 640 | 640 | Hotdog | 56 | 0 | 640 | 605 |
| 554 | 14043-hotdog_jpg.rf.8336579be067ac62410422f411... | 640 | 640 | Hotdog | 48 | 124 | 404 | 526 |
| 555 | 8006-hotdog_jpg.rf.2b5a43d73a7b80e624c778536e2... | 640 | 640 | Hotdog | 65 | 45 | 623 | 640 |
| 556 | 4345-hotdog_jpg.rf.c81f7d5ae5388487ceea9df4709... | 640 | 640 | Hotdog | 2 | 8 | 640 | 640 |
| 557 | 51643-hotdog_jpg.rf.2eeb177096d2f26e6f38322d53... | 640 | 640 | Hotdog | 3 | 26 | 483 | 607 |
558 rows × 8 columns
Display all unique food class¶
# Extract and display all unique food class names from the dataset
food_classes = set(food_annotations_df['class'])
print("List of unique food categories in the dataset:")
for food in sorted(food_classes):
print("-", food)
List of unique food categories in the dataset: - Apple Pie - Chocolate - French Fries - Hotdog - Nachos - Pizza - onion_rings - pancakes - spring_rolls - tacos
Check for duplicate filenames in the dataset¶
# Check for duplicate filenames in the dataset
duplicate_filenames = food_annotations_df[food_annotations_df.duplicated(subset='filename', keep=False)]
print(f"Total duplicate filenames found: {duplicate_filenames['filename'].nunique()}")
print("List of duplicated filenames:")
print(duplicate_filenames['filename'].value_counts())
Total duplicate filenames found: 32 List of duplicated filenames: filename 189678-nachos_jpg.rf.f186725dbfe1bc23e9532408103e1060.jpg 5 3004621_jpg.rf.1a70aad430f7fcc72cc14f91446d4c08.jpg 4 7394_jpg.rf.1838448cb2b3d641b167b9cfbca600cc.jpg 4 91964_jpg.rf.0c917d27d8f80e5c630140d81031d231.jpg 3 11193_jpg.rf.afefd57ffc19ba1eeb51afeee3bf37b4.jpg 3 113781_jpg.rf.de10ec12748947f00d231f8c55aaefb8.jpg 3 1030289_jpg.rf.702c29c39daf844a889cc73917369bdd.jpg 3 2618003_jpg.rf.8d18399346288665532d0826566a79eb.jpg 3 2861144_jpg.rf.a9287e2d7af886a3c026273c3349edba.jpg 3 36081_jpg.rf.bcde8146b7446e659e5d17e94d563635.jpg 2 1058697_jpg.rf.187204c8e93dbe0d20f8676a3f9f7c33.jpg 2 110171_jpg.rf.2e6a197703f7096765d773f023bda859.jpg 2 38615_jpg.rf.edfc43b51bb448e7763ffc9c6c3237c3.jpg 2 45817_jpg.rf.b4f80dfda9bea5836fedec2c7b65e578.jpg 2 62663_jpg.rf.d6e00a3b034bc15f515a5fa056ca1733.jpg 2 58787_jpg.rf.a8acae7e04404aeb8ad1c6a5f8b65434.jpg 2 145012_jpg.rf.4544abe395055b02ccd3e1076038f4ff.jpg 2 33259_jpg.rf.56a5b0558bdb03c426e60f6b5f89b8f4.jpg 2 78171_jpg.rf.4712e20db14395cc19199a4f927ec652.jpg 2 36370_jpg.rf.fc4e83fc5c0a333ddd949da6ac871995.jpg 2 62484_jpg.rf.7a9effc3895e6123dcf647b7f92549f6.jpg 2 92235_jpg.rf.53c19df7b5c9ec2f9d0ffcad8470c394.jpg 2 35235_jpg.rf.32771ba6dfe7c36611eee12e9a4076b6.jpg 2 71645_jpg.rf.7c1651d6851e2f6b318c16b37516c9e6.jpg 2 2983047_jpg.rf.0581d006429c601c3b014a9e4abe4b5c.jpg 2 74527_jpg.rf.a53136bdf4e575d077f34c3c1a41b50a.jpg 2 110385_jpg.rf.ed897b8ba0e20976351d7e0777963d00.jpg 2 80540_jpg.rf.134fc69263831ead08ce2f8a43ac5644.jpg 2 1126_jpg.rf.d3ba4b55b4bf612e7af22ea7ff137788.jpg 2 68177_jpg.rf.4286d561950cc21283c4e2b372092ac1.jpg 2 101450_jpg.rf.eddcc68593aa541ba3d9cce8835094be.jpg 2 95572_jpg.rf.a47685e871481cef6935b90644ff7ba5.jpg 2 Name: count, dtype: int64
Remove duplicate rows based on filename¶
# Remove duplicate rows based on filename, keeping the first occurrence
food_annotations_df = food_annotations_df.drop_duplicates(subset='filename', keep='first').reset_index(drop=True)
print(f"Duplicate rows removed. New shape of DataFrame: {food_annotations_df.shape}")
Duplicate rows removed. New shape of DataFrame: (513, 8)
Show the distribution of samples across different food classes¶
# Show the distribution of samples across different food classes
class_counts = food_annotations_df['class'].value_counts()
print("Food class distribution (class: count):")
for class_name, count in class_counts.items():
print(f"- {class_name}: {count}")
Food class distribution (class: count): - pancakes: 57 - spring_rolls: 53 - tacos: 52 - French Fries: 51 - onion_rings: 51 - Pizza: 50 - Nachos: 50 - Chocolate: 50 - Hotdog: 50 - Apple Pie: 49
Display summary statistics about the dataset¶
# Display summary statistics about the dataset
total_annotations = len(food_annotations_df)
unique_images = food_annotations_df['filename'].nunique()
unique_classes = food_annotations_df['class'].nunique()
print("Dataset Summary:")
print(f"- Total annotations : {total_annotations}")
print(f"- Unique image files : {unique_images}")
print(f"- Number of food classes : {unique_classes}")
Dataset Summary: - Total annotations : 513 - Unique image files : 513 - Number of food classes : 10
Observation :¶
The dictionary maps each food class name to a unique integer index from 0 to 9, following the alphabetical order of class names.
Class names like 'Apple Pie' and 'Chocolate' come first as they are alphabetically earlier.
The mapping is case-sensitive and sorted lexicographically, so lowercase names like 'onion_rings', 'pancakes', 'spring_rolls', and 'tacos' appear after the capitalized ones due to ASCII sorting rules.
This consistent and reproducible mapping is essential for:
Encoding labels during model training.
Decoding predictions back to readable class names.
With 10 classes total, this dictionary covers all classes with unique indices and no duplicates or missing entries.
Function to display bounding boxes¶
# Function to display bounding boxes for specified classes
def show_bboxes(df, n=5, classes_to_show=None):
# Filter by class if specified
if classes_to_show:
#classes_to_show = [cls.lower().replace(" ", "_") for cls in classes_to_show]
#df['class'] = df['class'].str.lower()
filtered_df = df[df['class'].isin(classes_to_show)]
if filtered_df.empty:
print(f"⚠️ No images found for classes: {classes_to_show}")
return
else:
filtered_df = df
img_files = filtered_df['filename'].unique()
total = min(n, len(img_files))
# Prepare grid layout (e.g., 5 images in 1 row)
fig, axes = plt.subplots(1, total, figsize=(5 * total, 5))
# If only one image, axes is not iterable
if total == 1:
axes = [axes]
for idx in range(total):
img_file = img_files[idx]
img_path = os.path.join(img_folder, img_file)
if not os.path.exists(img_path):
print(f"❌ Image not found: {img_path}")
continue
img = cv2.imread(img_path)
if img is None:
print(f"⚠️ Unable to read image: {img_file}")
continue
img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
ax = axes[idx]
ax.imshow(img_rgb)
# Draw all boxes for the current image
for _, row in filtered_df[filtered_df['filename'] == img_file].iterrows():
x_min, y_min, x_max, y_max = int(row['xmin']), int(row['ymin']), int(row['xmax']), int(row['ymax'])
label = row['class']
rect = patches.Rectangle((x_min, y_min), x_max - x_min, y_max - y_min,
linewidth=2, edgecolor='red', facecolor='none')
ax.add_patch(rect)
ax.text(x_min, y_min - 5, label, color='red', fontsize=10, backgroundcolor='white')
ax.axis('off')
ax.set_title(f"{img_file}", fontsize=10)
plt.tight_layout()
plt.show()
Display images with bounding box¶
# Show 5 images with boxes only for 'Apple Pie'
show_bboxes(food_annotations_df, n=5, classes_to_show=['Apple Pie'])
# Show 5 images with boxes only for 'French Fries'
show_bboxes(food_annotations_df, n=5, classes_to_show=['French Fries'])
# Show 5 images with boxes only for 'pancakes'
show_bboxes(food_annotations_df, n=5, classes_to_show=['pancakes'])
# Show 5 images with boxes only for 'tacos'
show_bboxes(food_annotations_df, n=5, classes_to_show=['tacos'])
# Show 5 images with boxes only for 'Pizza'
show_bboxes(food_annotations_df, n=5, classes_to_show=['Pizza'])
# Show 5 images with boxes only for 'Nachos'
show_bboxes(food_annotations_df, n=5, classes_to_show=['Nachos'])
# Show 5 images with boxes only for 'onion_rings'
show_bboxes(food_annotations_df, n=5, classes_to_show=['onion_rings'])
# Show 5 images with boxes only for 'spring_rolls'
show_bboxes(food_annotations_df, n=5, classes_to_show=['spring_rolls'])
# Show 5 images with boxes only for 'hot_dog'
show_bboxes(food_annotations_df, n=5, classes_to_show=['Hotdog'])
# Show 5 images with boxes only for 'chocolate_cake'
show_bboxes(food_annotations_df, n=5, classes_to_show=['Chocolate'])
Step 5: Design, train and test basic CNN models to classify the flood.¶
Utilities Functions¶
import random
import matplotlib.pyplot as plt
import numpy as np
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import classification_report, confusion_matrix
import seaborn as sns
from pandas import DataFrame
from tensorflow.keras.callbacks import ModelCheckpoint, ReduceLROnPlateau, EarlyStopping
def train_model(model, X_train, y_train, X_val, y_val, epochs=50, batch_size=32, filepath='model_best.weights.h5'):
checkpointer = ModelCheckpoint(
filepath=filepath,
verbose=1,
save_best_only=True,
save_weights_only=True
)
earlystopping = EarlyStopping(
monitor='val_loss',
min_delta=0.01,
patience=20,
mode='auto'
)
reduceLR = ReduceLROnPlateau(
monitor='val_loss',
factor=0.5,
patience=10,
mode='auto'
)
history = model.fit(
X_train, y_train,
validation_data=(X_val, y_val),
epochs=epochs,
batch_size=batch_size,
callbacks=[checkpointer, reduceLR, earlystopping],
verbose=1
)
return history
def plot_training_history(history, model, X_test, y_test=None, model_name="Model"):
"""
Plot training and validation metrics from model history,
and evaluate accuracy/loss on test data.
Args:
history: History object returned from model.fit()
model: Trained Keras model
X_test: Test feature set
y_test: Test labels
model_name: Name of the model for the plot title
"""
# Create figure with two subplots
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
# Plot accuracy
ax1.plot(history.history['accuracy'], label='Training Accuracy')
ax1.plot(history.history['val_accuracy'], label='Validation Accuracy')
ax1.set_title(f'{model_name} - Accuracy')
ax1.set_xlabel('Epoch')
ax1.set_ylabel('Accuracy')
ax1.legend()
ax1.grid(True)
# Plot loss
ax2.plot(history.history['loss'], label='Training Loss')
ax2.plot(history.history['val_loss'], label='Validation Loss')
ax2.set_title(f'{model_name} - Loss')
ax2.set_xlabel('Epoch')
ax2.set_ylabel('Loss')
ax2.legend()
ax2.grid(True)
plt.tight_layout()
plt.show()
# Print final training/validation metrics
print(f"\n🔍 Final Epoch Metrics:")
print(f"📈 Training Accuracy : {history.history['accuracy'][-1]:.2f}")
print(f"📉 Training Loss : {history.history['loss'][-1]:.2f}")
print(f"📈 Validation Accuracy : {history.history['val_accuracy'][-1]:.2f}")
print(f"📉 Validation Loss : {history.history['val_loss'][-1]:.4f}")
# Evaluate on test data
test_loss, test_accuracy = model.evaluate(X_test, y_test, verbose=0)
print(f"\n🧪 Test Accuracy : {test_accuracy:.2f}")
print(f"🧪 Test Loss : {test_loss:.2f}")
def evaluate_classification_model(model, X_test, y_test, y_train=None):
"""
Evaluate a classification model: prints classification report and shows confusion matrix.
Parameters:
- model: Trained Keras model
- X_test: Test features
- y_test: True labels (can be one-hot or class indices)
- y_train: (Optional) Training labels to ensure LabelEncoder covers all classes
"""
# Ensure X_test is a NumPy array with dtype float32
X_test = np.array(X_test).astype(np.float32) # 🔧 Fix applied here
# Predict class probabilities
y_pred_probs = model.predict(X_test)
# Get predicted class indices
y_pred_class = np.argmax(y_pred_probs, axis=1)
# Convert y_test to class indices if one-hot encoded
if y_test.ndim > 1 and y_test.shape[1] > 1:
y_test_class = np.argmax(y_test, axis=1)
else:
y_test_class = y_test.ravel().astype(int)
# Fit LabelEncoder on combined labels if y_train is provided
if y_train is not None:
all_labels = np.concatenate([y_train.ravel(), y_test_class])
else:
all_labels = y_test_class
label_encoder = LabelEncoder()
label_encoder.fit(all_labels)
# Decode predicted and true labels to class names
y_test_labels = label_encoder.inverse_transform(y_test_class.astype(int))
y_pred_labels = label_encoder.inverse_transform(y_pred_class.astype(int))
class_names = sorted(food_annotations_df['class'].unique())
# Print classification report
print("Classification Report:")
print(classification_report(y_test_labels, y_pred_labels, target_names=class_names, zero_division=0))
# Confusion Matrix
conf_mat = confusion_matrix(y_test_class, y_pred_class, labels=label_encoder.classes_)
# Plot Confusion Matrix
plt.figure(figsize=(10, 8))
sns.heatmap(conf_mat, annot=True, fmt='d', xticklabels=class_names, yticklabels=class_names, cmap='Blues')
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.title('Confusion Matrix')
plt.show()
def plot_random_predictions(X_test, y_test, class_names, model, num_samples=5):
"""
Plots random test samples with predicted and actual labels, showing 5 images per row max.
Correct predictions are shown in green, incorrect in red.
Args:
X_test (np.array): Test images, shape (N, H, W, C)
y_test (np.array): One-hot encoded labels, shape (N, num_classes)
class_names (list): List of class names corresponding to label indices
model (keras.Model): Trained classification model
num_samples (int): Number of random samples to display (default: 5)
"""
indices = random.sample(range(len(X_test)), num_samples)
cols = 5
rows = (num_samples + cols - 1) // cols # Ceiling division to get rows
plt.figure(figsize=(cols * 3, rows * 3)) # Adjust figure size
for i, idx in enumerate(indices):
img = X_test[idx]
true_label = np.argmax(y_test[idx])
pred_label = np.argmax(model.predict(np.expand_dims(img, axis=0), verbose=0))
color = 'green' if pred_label == true_label else 'red'
title_text = f"Pred: {class_names[pred_label]}\nActual: {class_names[true_label]}"
plt.subplot(rows, cols, i + 1)
plt.imshow(img)
plt.title(title_text, color=color, fontsize=10)
plt.axis('off')
plt.suptitle("Model Predictions on Random Test Images", fontsize=16)
plt.tight_layout()
plt.subplots_adjust(top=0.85) # Make space for suptitle
plt.show()
Step 5.1 Build Basic CNN 1 (Improved CNN)¶
Step 5.1.1:Preprocess Data¶
# Import train_test_split to split data into training and testing sets with optional stratification
from sklearn.model_selection import train_test_split
# Import to_categorical to convert integer labels into one-hot encoded format for classification models
from tensorflow.keras.utils import to_categorical
# Import img_to_array to convert PIL Images or numpy arrays to proper array format for model input
from tensorflow.keras.preprocessing.image import img_to_array
# Extract all unique food class names from the 'class' column in the annotations DataFrame,
# then sort them alphabetically to create a consistent ordered list of class names
class_names = sorted(food_annotations_df['class'].unique())
class_names
['Apple Pie', 'Chocolate', 'French Fries', 'Hotdog', 'Nachos', 'Pizza', 'onion_rings', 'pancakes', 'spring_rolls', 'tacos']
Resize the Image and Load images and it's corresponding labels¶
# --- Load images and corresponding labels ---
img_folder = 'Datasetv1/original_images/'
images = []
labels = []
for _, row in food_annotations_df.iterrows():
img_path = os.path.join(img_folder, row['filename'])
img = cv2.imread(img_path)
if img is not None:
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) # Convert from BGR to RGB
img = cv2.resize(img, (128, 128)) # Resize to 128x128
#img = img_to_array(img) / 255.0 # Normalize to [0, 1]
images.append(img)
labels.append(row['class'])
# --- Convert lists of images and labels to NumPy arrays ---
X = np.array(images)
y = np.array(labels)
# Display the shapes of the feature and label arrays
print(f"Shape of image data (X): {X.shape}")
print(f"Shape of label data (y): {y.shape}")
Shape of image data (X): (513, 128, 128, 3) Shape of label data (y): (513,)
y
array(['Apple Pie', 'French Fries', 'French Fries', 'Apple Pie',
'Apple Pie', 'French Fries', 'Apple Pie', 'Apple Pie',
'French Fries', 'Apple Pie', 'French Fries', 'French Fries',
'Apple Pie', 'Apple Pie', 'Apple Pie', 'French Fries', 'Apple Pie',
'French Fries', 'French Fries', 'French Fries', 'French Fries',
'Apple Pie', 'French Fries', 'French Fries', 'French Fries',
'French Fries', 'Apple Pie', 'Apple Pie', 'Apple Pie',
'French Fries', 'French Fries', 'Apple Pie', 'Apple Pie',
'Apple Pie', 'Apple Pie', 'French Fries', 'Apple Pie',
'French Fries', 'Apple Pie', 'French Fries', 'Apple Pie',
'French Fries', 'French Fries', 'French Fries', 'French Fries',
'French Fries', 'French Fries', 'Apple Pie', 'Apple Pie',
'Apple Pie', 'French Fries', 'Apple Pie', 'French Fries',
'French Fries', 'French Fries', 'French Fries', 'Apple Pie',
'Apple Pie', 'Apple Pie', 'Apple Pie', 'Apple Pie', 'Apple Pie',
'French Fries', 'French Fries', 'French Fries', 'Apple Pie',
'French Fries', 'Apple Pie', 'Apple Pie', 'French Fries',
'French Fries', 'Apple Pie', 'Apple Pie', 'French Fries',
'French Fries', 'Apple Pie', 'Apple Pie', 'Apple Pie',
'French Fries', 'French Fries', 'French Fries', 'Apple Pie',
'Apple Pie', 'French Fries', 'Apple Pie', 'French Fries',
'French Fries', 'French Fries', 'French Fries', 'Apple Pie',
'French Fries', 'Apple Pie', 'French Fries', 'Apple Pie',
'French Fries', 'Apple Pie', 'Apple Pie', 'Apple Pie',
'French Fries', 'Apple Pie', 'pancakes', 'pancakes', 'pancakes',
'pancakes', 'pancakes', 'pancakes', 'pancakes', 'pancakes',
'pancakes', 'pancakes', 'pancakes', 'pancakes', 'pancakes',
'pancakes', 'pancakes', 'pancakes', 'pancakes', 'pancakes',
'pancakes', 'pancakes', 'pancakes', 'pancakes', 'pancakes',
'pancakes', 'pancakes', 'pancakes', 'pancakes', 'pancakes',
'pancakes', 'pancakes', 'pancakes', 'pancakes', 'pancakes',
'pancakes', 'pancakes', 'pancakes', 'pancakes', 'pancakes',
'pancakes', 'pancakes', 'pancakes', 'pancakes', 'pancakes',
'pancakes', 'pancakes', 'pancakes', 'pancakes', 'pancakes',
'pancakes', 'pancakes', 'pancakes', 'pancakes', 'pancakes',
'pancakes', 'pancakes', 'pancakes', 'pancakes', 'tacos', 'tacos',
'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos',
'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos',
'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos',
'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos',
'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos',
'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos',
'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos',
'tacos', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza',
'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza',
'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza',
'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza',
'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza',
'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza',
'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza',
'Pizza', 'Pizza', 'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos',
'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos',
'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos',
'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos',
'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos',
'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos',
'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos',
'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos',
'Nachos', 'Nachos', 'Nachos', 'onion_rings', 'onion_rings',
'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
'onion_rings', 'spring_rolls', 'spring_rolls', 'spring_rolls',
'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
'spring_rolls', 'spring_rolls', 'Chocolate', 'Chocolate',
'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate',
'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate',
'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate',
'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate',
'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate',
'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate',
'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate',
'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate',
'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate',
'Chocolate', 'Chocolate', 'Chocolate', 'Hotdog', 'Hotdog',
'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog',
'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog',
'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog',
'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog',
'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog',
'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog',
'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog',
'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog'],
dtype='<U12')
# --- Convert to NumPy arrays ---
#X = np.array(images)
#y = to_categorical(labels, num_classes=len(class_names)) # One-hot encode the labels
Verifying an image and its label after splitting into X (images) and y (target labels), and your y contains target labels¶
import matplotlib.pyplot as plt
import random
# Number of images to display
num_display = 5
# Randomly pick image indices
indices = random.sample(range(len(images)), num_display)
plt.figure(figsize=(15, 5))
for i, idx in enumerate(indices):
plt.subplot(1, num_display, i + 1)
plt.imshow(images[idx])
class_name = y[idx] # Get class name using label index
plt.title(f"Label: {class_name}")
plt.axis('off')
plt.suptitle("Sample Images with Labels", fontsize=16)
plt.tight_layout()
plt.show()
Encode class labels¶
# Encode labels to integers first
from sklearn.preprocessing import LabelEncoder
label_encoder = LabelEncoder()
y_encoded = label_encoder.fit_transform(y)
# print summary
print("Labels encoded successfully.")
print(f"Number of classes: {len(label_encoder.classes_)}")
Labels encoded successfully. Number of classes: 10
# Get all unique class labels (original) and their encoded values
class_names = label_encoder.classes_
print("Label Mapping (Original Label → Encoded Index):")
for idx, label in enumerate(class_names):
print(f"{idx}: {label}")
Label Mapping (Original Label → Encoded Index): 0: Apple Pie 1: Chocolate 2: French Fries 3: Hotdog 4: Nachos 5: Pizza 6: onion_rings 7: pancakes 8: spring_rolls 9: tacos
Train Test Split:¶
# Split into train and temp sets (80% train, 20% temp), with stratification
X_train, X_temp, y_train_encoded, y_temp_encoded = train_test_split(
X, y_encoded, test_size=0.2, random_state=42, stratify=y_encoded
)
# Split temp into validation and test (each 10% of total), with stratification
X_valid, X_test, y_valid_encoded, y_test_encoded = train_test_split(
X_temp, y_temp_encoded, test_size=0.5, random_state=42, stratify=y_temp_encoded
)
# One-hot encode the labels
y_train = to_categorical(y_train_encoded)
y_valid = to_categorical(y_valid_encoded)
y_test = to_categorical(y_test_encoded)
# Print the shapes of the splits
print("Dataset Split Summary:")
print(f"Train set → X: {X_train.shape}, y: {y_train.shape}")
print(f"Validation → X: {X_valid.shape}, y: {y_valid.shape}")
print(f"Test set → X: {X_test.shape}, y: {y_test.shape}")
Dataset Split Summary: Train set → X: (410, 128, 128, 3), y: (410, 10) Validation → X: (51, 128, 128, 3), y: (51, 10) Test set → X: (52, 128, 128, 3), y: (52, 10)
Verify image-label mapping after splitting¶
print(label_encoder.classes_)
['Apple Pie' 'Chocolate' 'French Fries' 'Hotdog' 'Nachos' 'Pizza' 'onion_rings' 'pancakes' 'spring_rolls' 'tacos']
np.argmax(y_train[1])
7
# ------------------------------
# Display a Random Training Image with its Label
# ------------------------------
def show_samples(X, y, class_names, num_samples=5):
plt.figure(figsize=(15, 5))
for i in range(num_samples):
img = X[i]
label_idx = np.argmax(y[i]) # Convert one-hot label to index
plt.subplot(1, num_samples, i + 1)
plt.imshow(img)
plt.title(f"Label: {class_names[label_idx]}")
plt.axis('off')
plt.suptitle("Sample Training Images with Labels", fontsize=16)
plt.tight_layout()
plt.show()
# Call the function
show_samples(X_train, y_train, label_encoder.classes_)
# Check lengths
print(len(X_train), len(y_train)) # Should be equal
print(len(X_test), len(y_test)) # Should be equal
410 410 52 52
Check label distribution consistency¶
# Check label distribution consistency
import numpy as np
import collections
# Convert one-hot encoded labels to class indices
y_train_labels = np.argmax(y_train, axis=1)
y_test_labels = np.argmax(y_test, axis=1)
# Count label distribution
print("Train label distribution:", collections.Counter(y_train_labels))
print("Test label distribution:", collections.Counter(y_test_labels))
Train label distribution: Counter({7: 45, 8: 42, 9: 42, 6: 41, 2: 41, 4: 40, 1: 40, 3: 40, 5: 40, 0: 39})
Test label distribution: Counter({7: 6, 8: 6, 4: 5, 6: 5, 2: 5, 3: 5, 1: 5, 9: 5, 0: 5, 5: 5})
import matplotlib.pyplot as plt
# Count label distribution
train_counts = collections.Counter(y_train_labels)
test_counts = collections.Counter(y_test_labels)
# Sort labels for consistent plotting
labels = sorted(train_counts.keys())
# Get counts in sorted order
train_values = [train_counts[label] for label in labels]
test_values = [test_counts[label] for label in labels]
# Plotting
x = np.arange(len(labels))
width = 0.35
plt.figure(figsize=(12, 6))
plt.bar(x - width/2, train_values, width, label='Train', color='skyblue')
plt.bar(x + width/2, test_values, width, label='Test', color='salmon')
plt.xlabel('Class Label')
plt.ylabel('Number of Samples')
plt.title('Train vs Test Label Distribution')
plt.xticks(x, labels)
plt.legend()
plt.tight_layout()
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.show()
Observation :
- Dataset is fairly balanced, which is beneficial for model training, as it reduces the risk of bias toward any particular class.
Step 5.1.2: Build a Basic CNN¶
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.layers import Input
from tensorflow.keras.optimizers import Adam
# Define a simple CNN model for multi-class classification
basic_cnn_model_1 = Sequential([
Input(shape=(128, 128, 3)), # Input layer specifying image size and channels (RGB)
# First convolution + pooling block
Conv2D(32, (3, 3), activation='relu'),
MaxPooling2D((2, 2)),
# Second convolution + pooling block
Conv2D(64, (3, 3), activation='relu'),
MaxPooling2D((2, 2)),
# Third convolution + pooling block
Conv2D(128, (3, 3), activation='relu'),
MaxPooling2D((2, 2)),
# Flatten feature maps to a 1D vector for dense layers
Flatten(),
# Fully connected layer with 128 neurons
Dense(128, activation='relu'),
Dropout(0.5), # Dropout for regularization to prevent overfitting
# Output layer with number of classes and softmax activation
Dense(len(class_names), activation='softmax')
])
# Compile the model with Adam optimizer, categorical crossentropy loss for multi-class, and accuracy metric
basic_cnn_model_1.compile(
optimizer=Adam(learning_rate=0.001),
loss='categorical_crossentropy',
metrics=['accuracy']
)
# Print model architecture summary
basic_cnn_model_1.summary()
Model: "sequential_67"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ conv2d_260 (Conv2D) │ (None, 126, 126, 32) │ 896 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d_253 │ (None, 63, 63, 32) │ 0 │ │ (MaxPooling2D) │ │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_261 (Conv2D) │ (None, 61, 61, 64) │ 18,496 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d_254 │ (None, 30, 30, 64) │ 0 │ │ (MaxPooling2D) │ │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_262 (Conv2D) │ (None, 28, 28, 128) │ 73,856 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d_255 │ (None, 14, 14, 128) │ 0 │ │ (MaxPooling2D) │ │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ flatten_50 (Flatten) │ (None, 25088) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_135 (Dense) │ (None, 128) │ 3,211,392 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dropout_182 (Dropout) │ (None, 128) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_136 (Dense) │ (None, 10) │ 1,290 │ └─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 3,305,930 (12.61 MB)
Trainable params: 3,305,930 (12.61 MB)
Non-trainable params: 0 (0.00 B)
5.1.3: Train the Model¶
# Train the model
# history = basic_cnn_model_1.fit(
# X_train, # Training input features (e.g., images, text sequences)
# y_train,
# validation_data=(X_val, y_val), # Corresponding training labels
# epochs=10, # Number of times the model will iterate over the entire training data
# batch_size=16, # Number of samples the model processes before updating weights
# #validation_split=0.2 # Fraction of training data (20%) used for validation (i.e., 80% train, 20% validate)
# )
history = train_model(basic_cnn_model_1, X_train, y_train, X_valid, y_valid, epochs=20, batch_size=16)
Epoch 1/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 57ms/step - accuracy: 0.2579 - loss: 2.0964 Epoch 1: val_loss improved from inf to 2.23658, saving model to model_best.weights.h5 26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 63ms/step - accuracy: 0.2568 - loss: 2.0971 - val_accuracy: 0.1765 - val_loss: 2.2366 - learning_rate: 2.5000e-04 Epoch 2/20 26/26 ━━━━━━━━━━━━━━━━━━━━ 0s 50ms/step - accuracy: 0.3203 - loss: 1.9839 Epoch 2: val_loss did not improve from 2.23658 26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 53ms/step - accuracy: 0.3204 - loss: 1.9822 - val_accuracy: 0.1569 - val_loss: 2.2678 - learning_rate: 2.5000e-04 Epoch 3/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 51ms/step - accuracy: 0.4405 - loss: 1.7554 Epoch 3: val_loss did not improve from 2.23658 26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 53ms/step - accuracy: 0.4404 - loss: 1.7515 - val_accuracy: 0.1569 - val_loss: 2.3056 - learning_rate: 2.5000e-04 Epoch 4/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 54ms/step - accuracy: 0.4877 - loss: 1.5367 Epoch 4: val_loss did not improve from 2.23658 26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 56ms/step - accuracy: 0.4904 - loss: 1.5339 - val_accuracy: 0.1373 - val_loss: 2.3733 - learning_rate: 2.5000e-04 Epoch 5/20 26/26 ━━━━━━━━━━━━━━━━━━━━ 0s 48ms/step - accuracy: 0.6255 - loss: 1.2175 Epoch 5: val_loss did not improve from 2.23658 26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 51ms/step - accuracy: 0.6248 - loss: 1.2186 - val_accuracy: 0.1569 - val_loss: 2.4882 - learning_rate: 2.5000e-04 Epoch 6/20 26/26 ━━━━━━━━━━━━━━━━━━━━ 0s 48ms/step - accuracy: 0.6634 - loss: 1.0593 Epoch 6: val_loss did not improve from 2.23658 26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 51ms/step - accuracy: 0.6643 - loss: 1.0581 - val_accuracy: 0.2353 - val_loss: 2.7572 - learning_rate: 2.5000e-04 Epoch 7/20 26/26 ━━━━━━━━━━━━━━━━━━━━ 0s 48ms/step - accuracy: 0.7401 - loss: 0.8608 Epoch 7: val_loss did not improve from 2.23658 26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 50ms/step - accuracy: 0.7395 - loss: 0.8611 - val_accuracy: 0.1961 - val_loss: 3.0568 - learning_rate: 2.5000e-04 Epoch 8/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 49ms/step - accuracy: 0.8043 - loss: 0.7184 Epoch 8: val_loss did not improve from 2.23658 26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 51ms/step - accuracy: 0.8060 - loss: 0.7156 - val_accuracy: 0.1569 - val_loss: 3.6817 - learning_rate: 2.5000e-04 Epoch 9/20 26/26 ━━━━━━━━━━━━━━━━━━━━ 0s 48ms/step - accuracy: 0.8389 - loss: 0.6055 Epoch 9: val_loss did not improve from 2.23658 26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 51ms/step - accuracy: 0.8388 - loss: 0.6054 - val_accuracy: 0.1373 - val_loss: 3.5692 - learning_rate: 2.5000e-04 Epoch 10/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 53ms/step - accuracy: 0.8842 - loss: 0.4877 Epoch 10: val_loss did not improve from 2.23658 26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 55ms/step - accuracy: 0.8846 - loss: 0.4838 - val_accuracy: 0.1373 - val_loss: 4.1033 - learning_rate: 2.5000e-04 Epoch 11/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 49ms/step - accuracy: 0.8958 - loss: 0.3359 Epoch 11: val_loss did not improve from 2.23658 26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 51ms/step - accuracy: 0.8956 - loss: 0.3397 - val_accuracy: 0.1765 - val_loss: 3.5208 - learning_rate: 2.5000e-04 Epoch 12/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 49ms/step - accuracy: 0.9099 - loss: 0.3405 Epoch 12: val_loss did not improve from 2.23658 26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 51ms/step - accuracy: 0.9101 - loss: 0.3392 - val_accuracy: 0.1765 - val_loss: 3.7455 - learning_rate: 1.2500e-04 Epoch 13/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 50ms/step - accuracy: 0.9513 - loss: 0.2061 Epoch 13: val_loss did not improve from 2.23658 26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 52ms/step - accuracy: 0.9506 - loss: 0.2076 - val_accuracy: 0.0980 - val_loss: 4.4924 - learning_rate: 1.2500e-04 Epoch 14/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 49ms/step - accuracy: 0.9460 - loss: 0.2314 Epoch 14: val_loss did not improve from 2.23658 26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 51ms/step - accuracy: 0.9460 - loss: 0.2312 - val_accuracy: 0.1373 - val_loss: 5.0421 - learning_rate: 1.2500e-04 Epoch 15/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 50ms/step - accuracy: 0.9699 - loss: 0.1522 Epoch 15: val_loss did not improve from 2.23658 26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 52ms/step - accuracy: 0.9698 - loss: 0.1532 - val_accuracy: 0.0784 - val_loss: 4.6256 - learning_rate: 1.2500e-04 Epoch 16/20 26/26 ━━━━━━━━━━━━━━━━━━━━ 0s 48ms/step - accuracy: 0.9668 - loss: 0.1342 Epoch 16: val_loss did not improve from 2.23658 26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 51ms/step - accuracy: 0.9665 - loss: 0.1352 - val_accuracy: 0.1176 - val_loss: 4.5482 - learning_rate: 1.2500e-04 Epoch 17/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 49ms/step - accuracy: 0.9572 - loss: 0.1650 Epoch 17: val_loss did not improve from 2.23658 26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 51ms/step - accuracy: 0.9577 - loss: 0.1632 - val_accuracy: 0.1176 - val_loss: 4.9174 - learning_rate: 1.2500e-04 Epoch 18/20 26/26 ━━━━━━━━━━━━━━━━━━━━ 0s 49ms/step - accuracy: 0.9586 - loss: 0.1724 Epoch 18: val_loss did not improve from 2.23658 26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 51ms/step - accuracy: 0.9586 - loss: 0.1723 - val_accuracy: 0.1176 - val_loss: 4.3938 - learning_rate: 1.2500e-04 Epoch 19/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 50ms/step - accuracy: 0.9883 - loss: 0.0975 Epoch 19: val_loss did not improve from 2.23658 26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 52ms/step - accuracy: 0.9881 - loss: 0.0966 - val_accuracy: 0.1765 - val_loss: 5.4650 - learning_rate: 1.2500e-04 Epoch 20/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 49ms/step - accuracy: 0.9696 - loss: 0.1042 Epoch 20: val_loss did not improve from 2.23658 26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 50ms/step - accuracy: 0.9703 - loss: 0.1042 - val_accuracy: 0.1765 - val_loss: 5.5215 - learning_rate: 1.2500e-04
Model Evaluation & Visualization¶
# Plot training history and evaluate the model on test data
# ---------------------------------------------------------
# history : The training history object returned by model.fit(), containing loss and accuracy over epochs
# basic_cnn_model_1 : The trained Keras model to be evaluated
# X_test, y_test : Test dataset used to evaluate model performance after training
# model_name : (Optional) Custom name for title/labeling plots and saving figures
plot_training_history(history, basic_cnn_model_1, X_test, y_test, model_name="Basic CNN 1")
🔍 Final Epoch Metrics: 📈 Training Accuracy : 0.98 📉 Training Loss : 0.10 📈 Validation Accuracy : 0.18 📉 Validation Loss : 5.5215 🧪 Test Accuracy : 0.23 🧪 Test Loss : 4.43
Based on the final epoch metrics, test performance, and epoch-wise training logs, here are detailed observations and insights about your model's training behavior:
Large gap between training and validation/test accuracy and diverging loss values indicate overfitting. Model memorizes training data but fails to generalize.
Observation:
- Training accuracy steadily improves.
- Validation accuracy stagnates (~39%) and validation loss increases after epoch 5-6, showing early overfitting.
- Best generalization observed around epochs 5-6
Conclusion:
Poor generalization on unseen data confirmed by low test accuracy and high test loss.
Summary of Issues
| Problem | Evidence |
|---|---|
| Overfitting | High train acc vs low val/test acc |
| Poor generalization | Test accuracy and loss worse than validation |
| Validation loss rise | Val loss increases after epoch 5 while train loss decreases |
| Model complexity | Model fits training data too well too quickly |
# Evaluate the trained classification model on the test set
# ----------------------------------------------------------
# basic_cnn_model_1 : The trained Keras model that will be evaluated
# X_test : Test feature data (e.g., images) for model prediction
# y_test : True labels (can be one-hot encoded or class indices) for evaluating predictions
# y_train : Optional — training labels used to fit LabelEncoder on all classes (helps preserve class label mapping)
evaluate_classification_model(basic_cnn_model_1, X_test, y_test, y_train=y_train)
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 50ms/step Classification Report: precision recall f1-score support Apple Pie 0.12 0.20 0.15 5 Chocolate 0.20 0.20 0.20 5 French Fries 0.50 0.40 0.44 5 Hotdog 0.00 0.00 0.00 5 Nachos 0.00 0.00 0.00 5 Pizza 0.33 0.20 0.25 5 onion_rings 0.43 0.60 0.50 5 pancakes 0.33 0.33 0.33 6 spring_rolls 0.20 0.17 0.18 6 tacos 0.17 0.20 0.18 5 accuracy 0.23 52 macro avg 0.23 0.23 0.22 52 weighted avg 0.23 0.23 0.23 52
# Visualize predictions on random test images
# Arguments:
# - X_test : array of test images (preprocessed, shape like (N, H, W, C))
# - y_test : one-hot encoded true labels for test images
# - class_names : list of class label names corresponding to indices
# - basic_cnn_model_1 : trained classification model
# - num_samples : number of random samples to display (default is 5)
plot_random_predictions(X_test, y_test, class_names, basic_cnn_model_1, num_samples=20)
Classification Report Summary:
- Overall accuracy is low (~28%), showing the model struggles with correct predictions.
- Most classes have poor precision, recall, and F1-scores; the highest F1 is ~0.52 (Chocolate).
- Some classes (e.g., Hotdog) have zero precision and recall, meaning no correct predictions.
- Classes like Nachos show higher recall but low precision, indicating many false positives.
- The model likely underfits or lacks discriminative features.
- Recommendations:
- Increase dataset size or balance classes.
- Apply data augmentation.
- Use class weighting or sampling strategies.
- Tune or try more powerful models (e.g., transfer learning).
Observations on Confusion Matrix
This confusion matrix shows how well a model predicts food items. Here are some key points:
Good Predictions:
- The model is best at predicting "spring_rolls" (10 correct) and "tacos" (8 correct).
- "chocolate" (7 correct), "nachos" (7 correct), and "pancakes" (6 correct) are also predicted well.
Common Mistakes:
- "Apple Pie" is often confused with "pizza" (4 times) and "tacos" (4 times).
- "French Fries" is mistaken for "tacos" (3 times).
- "Hotdog" is confused with "tacos" (5 times).
- "onion_rings" is often predicted as "french_fries" (6 times).
Overall Performance:
- The model struggles with "Apple Pie" and "onion_rings" the most, as they have low correct predictions (2 and 4).
- Some classes like "pizza" and "hotdog" are confused with multiple other classes, showing the model has trouble distinguishing them.
The model does well for some foods but mixes up others, especially with "tacos" and "french_fries." It needs improvement for better accuracy.
Step 5.2 Build Basic CNN 2 (Improved CNN)¶
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D, Dense, Dropout, BatchNormalization, Input
from tensorflow.keras.regularizers import l2
from tensorflow.keras.optimizers import Adam
basic_cnn_model_2 = Sequential([
Input(shape=(128, 128, 3)),
Conv2D(32, (3, 3), activation='relu', padding='same'),
BatchNormalization(),
MaxPooling2D((2, 2)),
Conv2D(64, (3, 3), activation='relu', padding='same'),
BatchNormalization(),
MaxPooling2D((2, 2)),
Conv2D(64, (3, 3), activation='relu', padding='same'),
BatchNormalization(),
MaxPooling2D((2, 2)),
GlobalAveragePooling2D(),
Dense(128, activation='relu', kernel_regularizer=l2(0.001)),
Dropout(0.5),
Dense(len(class_names), activation='softmax')
])
basic_cnn_model_2.compile(optimizer=Adam(learning_rate=1e-4),
loss='categorical_crossentropy',
metrics=['accuracy'])
basic_cnn_model_2.summary()
Model: "sequential_73"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ conv2d_278 (Conv2D) │ (None, 128, 128, 32) │ 896 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ batch_normalization_176 │ (None, 128, 128, 32) │ 128 │ │ (BatchNormalization) │ │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d_271 │ (None, 64, 64, 32) │ 0 │ │ (MaxPooling2D) │ │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_279 (Conv2D) │ (None, 64, 64, 64) │ 18,496 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ batch_normalization_177 │ (None, 64, 64, 64) │ 256 │ │ (BatchNormalization) │ │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d_272 │ (None, 32, 32, 64) │ 0 │ │ (MaxPooling2D) │ │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_280 (Conv2D) │ (None, 32, 32, 64) │ 36,928 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ batch_normalization_178 │ (None, 32, 32, 64) │ 256 │ │ (BatchNormalization) │ │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d_273 │ (None, 16, 16, 64) │ 0 │ │ (MaxPooling2D) │ │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ global_average_pooling2d_21 │ (None, 64) │ 0 │ │ (GlobalAveragePooling2D) │ │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_147 (Dense) │ (None, 128) │ 8,320 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dropout_191 (Dropout) │ (None, 128) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_148 (Dense) │ (None, 10) │ 1,290 │ └─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 66,570 (260.04 KB)
Trainable params: 66,250 (258.79 KB)
Non-trainable params: 320 (1.25 KB)
basic_cnn_model_2_history = train_model(basic_cnn_model_2, X_train, y_train, X_valid, y_valid, epochs=20, batch_size=16, filepath='basic_cnn_model_2_best.weights.h5')
Epoch 1/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 70ms/step - accuracy: 0.0916 - loss: 2.6087 Epoch 1: val_loss improved from inf to 2.74920, saving model to basic_cnn_model_2_best.weights.h5 26/26 ━━━━━━━━━━━━━━━━━━━━ 3s 77ms/step - accuracy: 0.0918 - loss: 2.6050 - val_accuracy: 0.1373 - val_loss: 2.7492 - learning_rate: 1.0000e-04 Epoch 2/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 69ms/step - accuracy: 0.0902 - loss: 2.5299 Epoch 2: val_loss improved from 2.74920 to 2.50387, saving model to basic_cnn_model_2_best.weights.h5 26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 72ms/step - accuracy: 0.0918 - loss: 2.5261 - val_accuracy: 0.1373 - val_loss: 2.5039 - learning_rate: 1.0000e-04 Epoch 3/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 75ms/step - accuracy: 0.1166 - loss: 2.4303 Epoch 3: val_loss improved from 2.50387 to 2.43262, saving model to basic_cnn_model_2_best.weights.h5 26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 77ms/step - accuracy: 0.1167 - loss: 2.4307 - val_accuracy: 0.0980 - val_loss: 2.4326 - learning_rate: 1.0000e-04 Epoch 4/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 69ms/step - accuracy: 0.1419 - loss: 2.4111 Epoch 4: val_loss improved from 2.43262 to 2.38982, saving model to basic_cnn_model_2_best.weights.h5 26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 71ms/step - accuracy: 0.1433 - loss: 2.4101 - val_accuracy: 0.1176 - val_loss: 2.3898 - learning_rate: 1.0000e-04 Epoch 5/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 69ms/step - accuracy: 0.1307 - loss: 2.4276 Epoch 5: val_loss improved from 2.38982 to 2.36039, saving model to basic_cnn_model_2_best.weights.h5 26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 71ms/step - accuracy: 0.1311 - loss: 2.4269 - val_accuracy: 0.1176 - val_loss: 2.3604 - learning_rate: 1.0000e-04 Epoch 6/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 69ms/step - accuracy: 0.1826 - loss: 2.2419 Epoch 6: val_loss improved from 2.36039 to 2.35024, saving model to basic_cnn_model_2_best.weights.h5 26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 71ms/step - accuracy: 0.1820 - loss: 2.2447 - val_accuracy: 0.1569 - val_loss: 2.3502 - learning_rate: 1.0000e-04 Epoch 7/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 83ms/step - accuracy: 0.1655 - loss: 2.2877 Epoch 7: val_loss improved from 2.35024 to 2.33138, saving model to basic_cnn_model_2_best.weights.h5 26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 86ms/step - accuracy: 0.1645 - loss: 2.2910 - val_accuracy: 0.2549 - val_loss: 2.3314 - learning_rate: 1.0000e-04 Epoch 8/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 74ms/step - accuracy: 0.2436 - loss: 2.3232 Epoch 8: val_loss improved from 2.33138 to 2.29568, saving model to basic_cnn_model_2_best.weights.h5 26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 77ms/step - accuracy: 0.2427 - loss: 2.3215 - val_accuracy: 0.2157 - val_loss: 2.2957 - learning_rate: 1.0000e-04 Epoch 9/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 71ms/step - accuracy: 0.2186 - loss: 2.2519 Epoch 9: val_loss improved from 2.29568 to 2.27044, saving model to basic_cnn_model_2_best.weights.h5 26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 74ms/step - accuracy: 0.2171 - loss: 2.2536 - val_accuracy: 0.2549 - val_loss: 2.2704 - learning_rate: 1.0000e-04 Epoch 10/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 69ms/step - accuracy: 0.2218 - loss: 2.1913 Epoch 10: val_loss improved from 2.27044 to 2.26747, saving model to basic_cnn_model_2_best.weights.h5 26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 71ms/step - accuracy: 0.2209 - loss: 2.1928 - val_accuracy: 0.2549 - val_loss: 2.2675 - learning_rate: 1.0000e-04 Epoch 11/20 26/26 ━━━━━━━━━━━━━━━━━━━━ 0s 70ms/step - accuracy: 0.2001 - loss: 2.2228 Epoch 11: val_loss improved from 2.26747 to 2.24480, saving model to basic_cnn_model_2_best.weights.h5 26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 74ms/step - accuracy: 0.2007 - loss: 2.2219 - val_accuracy: 0.2745 - val_loss: 2.2448 - learning_rate: 1.0000e-04 Epoch 12/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 77ms/step - accuracy: 0.2629 - loss: 2.1499 Epoch 12: val_loss improved from 2.24480 to 2.22419, saving model to basic_cnn_model_2_best.weights.h5 26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 79ms/step - accuracy: 0.2630 - loss: 2.1508 - val_accuracy: 0.2353 - val_loss: 2.2242 - learning_rate: 1.0000e-04 Epoch 13/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 70ms/step - accuracy: 0.2225 - loss: 2.2148 Epoch 13: val_loss improved from 2.22419 to 2.21153, saving model to basic_cnn_model_2_best.weights.h5 26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 72ms/step - accuracy: 0.2237 - loss: 2.2109 - val_accuracy: 0.2549 - val_loss: 2.2115 - learning_rate: 1.0000e-04 Epoch 14/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 70ms/step - accuracy: 0.2792 - loss: 2.1106 Epoch 14: val_loss improved from 2.21153 to 2.20730, saving model to basic_cnn_model_2_best.weights.h5 26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 72ms/step - accuracy: 0.2779 - loss: 2.1116 - val_accuracy: 0.2745 - val_loss: 2.2073 - learning_rate: 1.0000e-04 Epoch 15/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 70ms/step - accuracy: 0.2980 - loss: 2.1514 Epoch 15: val_loss improved from 2.20730 to 2.19979, saving model to basic_cnn_model_2_best.weights.h5 26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 73ms/step - accuracy: 0.2969 - loss: 2.1514 - val_accuracy: 0.2157 - val_loss: 2.1998 - learning_rate: 1.0000e-04 Epoch 16/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 71ms/step - accuracy: 0.2842 - loss: 2.1513 Epoch 16: val_loss improved from 2.19979 to 2.19116, saving model to basic_cnn_model_2_best.weights.h5 26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 73ms/step - accuracy: 0.2843 - loss: 2.1491 - val_accuracy: 0.2157 - val_loss: 2.1912 - learning_rate: 1.0000e-04 Epoch 17/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 69ms/step - accuracy: 0.2821 - loss: 2.0929 Epoch 17: val_loss improved from 2.19116 to 2.19076, saving model to basic_cnn_model_2_best.weights.h5 26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 72ms/step - accuracy: 0.2809 - loss: 2.0935 - val_accuracy: 0.2157 - val_loss: 2.1908 - learning_rate: 1.0000e-04 Epoch 18/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 70ms/step - accuracy: 0.2972 - loss: 2.1153 Epoch 18: val_loss improved from 2.19076 to 2.15424, saving model to basic_cnn_model_2_best.weights.h5 26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 72ms/step - accuracy: 0.2982 - loss: 2.1121 - val_accuracy: 0.2549 - val_loss: 2.1542 - learning_rate: 1.0000e-04 Epoch 19/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 70ms/step - accuracy: 0.2608 - loss: 2.0674 Epoch 19: val_loss improved from 2.15424 to 2.15281, saving model to basic_cnn_model_2_best.weights.h5 26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 72ms/step - accuracy: 0.2640 - loss: 2.0656 - val_accuracy: 0.2549 - val_loss: 2.1528 - learning_rate: 1.0000e-04 Epoch 20/20 25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 69ms/step - accuracy: 0.2518 - loss: 2.1404 Epoch 20: val_loss did not improve from 2.15281 26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 71ms/step - accuracy: 0.2541 - loss: 2.1379 - val_accuracy: 0.2745 - val_loss: 2.1570 - learning_rate: 1.0000e-04
plot_training_history(basic_cnn_model_2_history, basic_cnn_model_2, X_test, y_test, model_name="Basic CNN 2")
🔍 Final Epoch Metrics: 📈 Training Accuracy : 0.28 📉 Training Loss : 2.11 📈 Validation Accuracy : 0.27 📉 Validation Loss : 2.1570 🧪 Test Accuracy : 0.29 🧪 Test Loss : 2.12
# Evaluate the trained classification model on the test set
# ----------------------------------------------------------
# basic_cnn_model_1 : The trained Keras model that will be evaluated
# X_test : Test feature data (e.g., images) for model prediction
# y_test : True labels (can be one-hot encoded or class indices) for evaluating predictions
# y_train : Optional — training labels used to fit LabelEncoder on all classes (helps preserve class label mapping)
evaluate_classification_model(basic_cnn_model_2, X_test, y_test, y_train=y_train)
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 35ms/step Classification Report: precision recall f1-score support Apple Pie 0.50 0.20 0.29 5 Chocolate 0.30 0.60 0.40 5 French Fries 0.00 0.00 0.00 5 Hotdog 0.33 0.20 0.25 5 Nachos 0.20 0.20 0.20 5 Pizza 0.40 0.40 0.40 5 onion_rings 0.25 0.40 0.31 5 pancakes 0.67 0.33 0.44 6 spring_rolls 0.23 0.50 0.32 6 tacos 0.00 0.00 0.00 5 accuracy 0.29 52 macro avg 0.29 0.28 0.26 52 weighted avg 0.29 0.29 0.26 52
test_loss, test_acc = basic_cnn_model_2.evaluate(X_test, y_test)
print(f"Test Accuracy: {test_acc * 100:.2f}%")
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 37ms/step - accuracy: 0.2965 - loss: 2.1394 Test Accuracy: 28.85%
# Visualize predictions on random test images
# Arguments:
# - X_test : array of test images (preprocessed, shape like (N, H, W, C))
# - y_test : one-hot encoded true labels for test images
# - class_names : list of class label names corresponding to indices
# - basic_cnn_model_1 : trained classification model
# - num_samples : number of random samples to display (default is 5)
plot_random_predictions(X_test, y_test, class_names, basic_cnn_model_1, num_samples=20)
Basic CNN Model 3 With Data Augmentation¶
import pandas as pd
# Create a copy of the original annotations DataFrame to avoid modifying it directly
# Group by 'filename' to aggregate all rows for each unique image filename
# For each group, take the first row (useful if there are multiple annotations per image)
# Reset the index so the result is a clean DataFrame with default integer indexing
annotations_df = food_annotations_df.copy().groupby('filename').first().reset_index()
# Display or inspect the processed annotations DataFrame
# This DataFrame contains one row per unique filename
# Each row corresponds to the first annotation found for that image file
annotations_df
| filename | width | height | class | xmin | ymin | xmax | ymax | label | |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0301-hotdog_jpg.rf.d9d8524fb7b25b7e1549de49ab6... | 640 | 640 | Hotdog | 137 | 21 | 619 | 616 | 3 |
| 1 | 0909-hotdog_jpg.rf.92cd083aea53111f94e0b935d6a... | 640 | 640 | Hotdog | 151 | 182 | 543 | 599 | 3 |
| 2 | 100148_jpg.rf.2206fe7bfbb84220314498f35fae6bbb... | 512 | 384 | French Fries | 126 | 61 | 500 | 367 | 2 |
| 3 | 100284-nachos_jpg.rf.d8125a679ab339aaad0c57e48... | 640 | 640 | Nachos | 0 | 143 | 558 | 597 | 4 |
| 4 | 101450_jpg.rf.eddcc68593aa541ba3d9cce8835094be... | 512 | 512 | pancakes | 22 | 179 | 229 | 306 | 7 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 508 | 99074-nachos_jpg.rf.a0d38f7404682ad9fe4c2eb54a... | 640 | 640 | Nachos | 0 | 24 | 640 | 604 | 4 |
| 509 | 99076-nachos_jpg.rf.072c2c0791659ef3c42d13d950... | 640 | 640 | Nachos | 0 | 33 | 640 | 596 | 4 |
| 510 | 99087-nachos_jpg.rf.6996cfcf63d57f957a64256cec... | 640 | 640 | Nachos | 2 | 48 | 640 | 640 | 4 |
| 511 | 99088-nachos_jpg.rf.75fbb62b1beeceecb85275f7dd... | 640 | 640 | Nachos | 0 | 101 | 640 | 605 | 4 |
| 512 | 9949_jpg.rf.79c214a1830051934e6cfe1f557c6f86.jpg | 512 | 512 | pancakes | 33 | 182 | 486 | 458 | 7 |
513 rows × 9 columns
from sklearn.model_selection import train_test_split
train_df, temp_df = train_test_split(annotations_df, test_size=0.2, stratify=annotations_df['class'], random_state=42)
val_df, test_df = train_test_split(temp_df, test_size=0.5, stratify=temp_df['class'], random_state=42)
print(train_df.shape)
print(test_df.shape)
print(val_df.shape)
(410, 9) (52, 9) (51, 9)
# Check Label Distribution
train_df['class'].value_counts().plot(kind='bar')
<Axes: xlabel='class'>
Check Image Files Exist and Are Correctly Referenced
- Before using flow_from_dataframe(), you should verify that all image filenames in your dataframe actually exist in your image folde
import os
# Function to check if image files exist
def check_image_files(df, img_dir):
missing_files = []
for fname in df['filename']:
if not os.path.isfile(os.path.join(img_dir, fname)):
missing_files.append(fname)
return missing_files
# Check train, val, test dataframes
missing_train = check_image_files(train_df, img_folder)
missing_val = check_image_files(val_df, img_folder)
missing_test = check_image_files(test_df, img_folder)
print(f"Missing files in training data: {len(missing_train)}")
print(f"Missing files in validation data: {len(missing_val)}")
print(f"Missing files in test data: {len(missing_test)}")
if missing_train:
print("Some missing training images:", missing_train[:5])
if missing_val:
print("Some missing validation images:", missing_val[:5])
if missing_test:
print("Some missing test images:", missing_test[:5])
Missing files in training data: 0 Missing files in validation data: 0 Missing files in test data: 0
from tensorflow.keras.preprocessing.image import ImageDataGenerator
img_folder = 'Datasetv1/original_images/'
train_datagen = ImageDataGenerator(
rescale=1./255,
horizontal_flip=True,
vertical_flip=True,
rotation_range=20,
zoom_range=0.2,
shear_range=0.1,
width_shift_range=0.1, # added width shift
height_shift_range=0.1, # added height shift
fill_mode='nearest' # better filling of pixels after transformations
)
val_test_datagen = ImageDataGenerator(rescale=1./255)
# Flow from DataFrame
train_generator = train_datagen.flow_from_dataframe(
dataframe=train_df,
directory=img_folder,
x_col='filename',
y_col='class',
target_size=(128, 128),
class_mode='categorical',
batch_size=32,
shuffle=True
)
val_generator = val_test_datagen.flow_from_dataframe(
dataframe=val_df,
directory=img_folder,
x_col='filename',
y_col='class',
target_size=(128, 128),
class_mode='categorical',
batch_size=32,
shuffle=False
)
test_generator = val_test_datagen.flow_from_dataframe(
dataframe=test_df,
directory=img_folder,
x_col='filename',
y_col='class',
target_size=(128, 128),
class_mode='categorical',
batch_size=32,
shuffle=False
)
Found 410 validated image filenames belonging to 10 classes. Found 51 validated image filenames belonging to 10 classes. Found 52 validated image filenames belonging to 10 classes.
print(train_generator.class_indices)
print(test_generator.class_indices)
print(val_generator.class_indices)
{'Apple Pie': 0, 'Chocolate': 1, 'French Fries': 2, 'Hotdog': 3, 'Nachos': 4, 'Pizza': 5, 'onion_rings': 6, 'pancakes': 7, 'spring_rolls': 8, 'tacos': 9}
{'Apple Pie': 0, 'Chocolate': 1, 'French Fries': 2, 'Hotdog': 3, 'Nachos': 4, 'Pizza': 5, 'onion_rings': 6, 'pancakes': 7, 'spring_rolls': 8, 'tacos': 9}
{'Apple Pie': 0, 'Chocolate': 1, 'French Fries': 2, 'Hotdog': 3, 'Nachos': 4, 'Pizza': 5, 'onion_rings': 6, 'pancakes': 7, 'spring_rolls': 8, 'tacos': 9}
images, labels = next(train_generator)
print("Labels shape:", labels.shape)
print("Example label[0]:", labels[0])
Labels shape: (32, 10) Example label[0]: [0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
# Get mapping from class index to label name
class_indices = train_generator.class_indices
inv_class_indices = {v: k for k, v in class_indices.items()}
num_images = 5
plt.figure(figsize=(15, 5))
for i in range(num_images):
ax = plt.subplot(1, num_images, i + 1)
plt.imshow(images[i])
class_index = np.argmax(labels[i])
class_label = inv_class_indices[class_index]
plt.title(f"{class_label}")
plt.axis("off")
plt.tight_layout()
plt.show()
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D, Dense, Dropout, BatchNormalization, Input
from tensorflow.keras.regularizers import l2
from tensorflow.keras.optimizers import Adam
basic_cnn_model_3 = Sequential([
Input(shape=(128, 128, 3)),
Conv2D(32, (3, 3), activation='relu', padding='same'),
MaxPooling2D((2, 2)),
Conv2D(64, (3, 3), activation='relu', padding='same'),
MaxPooling2D((2, 2)),
Conv2D(64, (3, 3), activation='relu', padding='same'),
MaxPooling2D((2, 2)),
Conv2D(128, (3, 3), activation='relu', padding='same'),
MaxPooling2D((2, 2)),
GlobalAveragePooling2D(), # ✅ Fix: Reduce 3D to 1D
Dense(128, activation='relu', kernel_regularizer=l2(0.001)),
Dropout(0.5),
Dense(len(class_names), activation='softmax')
])
basic_cnn_model_3.compile(optimizer=Adam(learning_rate=1e-3),
loss='categorical_crossentropy',
metrics=['accuracy'])
basic_cnn_model_2.summary()
Model: "sequential_73"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ conv2d_278 (Conv2D) │ (None, 128, 128, 32) │ 896 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ batch_normalization_176 │ (None, 128, 128, 32) │ 128 │ │ (BatchNormalization) │ │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d_271 │ (None, 64, 64, 32) │ 0 │ │ (MaxPooling2D) │ │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_279 (Conv2D) │ (None, 64, 64, 64) │ 18,496 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ batch_normalization_177 │ (None, 64, 64, 64) │ 256 │ │ (BatchNormalization) │ │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d_272 │ (None, 32, 32, 64) │ 0 │ │ (MaxPooling2D) │ │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_280 (Conv2D) │ (None, 32, 32, 64) │ 36,928 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ batch_normalization_178 │ (None, 32, 32, 64) │ 256 │ │ (BatchNormalization) │ │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d_273 │ (None, 16, 16, 64) │ 0 │ │ (MaxPooling2D) │ │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ global_average_pooling2d_21 │ (None, 64) │ 0 │ │ (GlobalAveragePooling2D) │ │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_147 (Dense) │ (None, 128) │ 8,320 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dropout_191 (Dropout) │ (None, 128) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_148 (Dense) │ (None, 10) │ 1,290 │ └─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 199,072 (777.63 KB)
Trainable params: 66,250 (258.79 KB)
Non-trainable params: 320 (1.25 KB)
Optimizer params: 132,502 (517.59 KB)
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau, ModelCheckpoint
# Callbacks for better training control
callbacks = [
EarlyStopping(monitor='val_loss', min_delta=0.01, patience=20, verbose=1, mode='auto'),
ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=10, verbose=1, mode='auto'),
ModelCheckpoint("basic_cnn_model_3_augmented.weights.h5", save_best_only=True, save_weights_only=True ,verbose=1)
]
# Train the model
basic_cnn_model_3_history = basic_cnn_model_3.fit(
train_generator,
validation_data=val_generator,
epochs=50,
callbacks=callbacks
)
Epoch 1/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 104ms/step - accuracy: 0.1029 - loss: 2.4353 Epoch 1: val_loss improved from inf to 2.41134, saving model to basic_cnn_model_3_augmented.weights.h5 13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 125ms/step - accuracy: 0.1018 - loss: 2.4350 - val_accuracy: 0.0980 - val_loss: 2.4113 - learning_rate: 0.0010 Epoch 2/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 99ms/step - accuracy: 0.1069 - loss: 2.4060 Epoch 2: val_loss improved from 2.41134 to 2.39294, saving model to basic_cnn_model_3_augmented.weights.h5 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 107ms/step - accuracy: 0.1059 - loss: 2.4058 - val_accuracy: 0.0980 - val_loss: 2.3929 - learning_rate: 0.0010 Epoch 3/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 98ms/step - accuracy: 0.1001 - loss: 2.3907 Epoch 3: val_loss improved from 2.39294 to 2.37787, saving model to basic_cnn_model_3_augmented.weights.h5 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 107ms/step - accuracy: 0.1009 - loss: 2.3903 - val_accuracy: 0.0980 - val_loss: 2.3779 - learning_rate: 0.0010 Epoch 4/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 98ms/step - accuracy: 0.0970 - loss: 2.3741 Epoch 4: val_loss improved from 2.37787 to 2.36468, saving model to basic_cnn_model_3_augmented.weights.h5 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 107ms/step - accuracy: 0.0972 - loss: 2.3740 - val_accuracy: 0.0980 - val_loss: 2.3647 - learning_rate: 0.0010 Epoch 5/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 99ms/step - accuracy: 0.1140 - loss: 2.3578 Epoch 5: val_loss improved from 2.36468 to 2.35570, saving model to basic_cnn_model_3_augmented.weights.h5 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 106ms/step - accuracy: 0.1137 - loss: 2.3579 - val_accuracy: 0.0980 - val_loss: 2.3557 - learning_rate: 0.0010 Epoch 6/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 100ms/step - accuracy: 0.1066 - loss: 2.3504 Epoch 6: val_loss improved from 2.35570 to 2.34292, saving model to basic_cnn_model_3_augmented.weights.h5 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 107ms/step - accuracy: 0.1065 - loss: 2.3500 - val_accuracy: 0.0980 - val_loss: 2.3429 - learning_rate: 0.0010 Epoch 7/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 99ms/step - accuracy: 0.1224 - loss: 2.3263 Epoch 7: val_loss improved from 2.34292 to 2.32675, saving model to basic_cnn_model_3_augmented.weights.h5 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 107ms/step - accuracy: 0.1220 - loss: 2.3264 - val_accuracy: 0.1765 - val_loss: 2.3268 - learning_rate: 0.0010 Epoch 8/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 114ms/step - accuracy: 0.1551 - loss: 2.2939 Epoch 8: val_loss improved from 2.32675 to 2.32509, saving model to basic_cnn_model_3_augmented.weights.h5 13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 123ms/step - accuracy: 0.1535 - loss: 2.2942 - val_accuracy: 0.1176 - val_loss: 2.3251 - learning_rate: 0.0010 Epoch 9/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 99ms/step - accuracy: 0.1242 - loss: 2.2611 Epoch 9: val_loss did not improve from 2.32509 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 105ms/step - accuracy: 0.1242 - loss: 2.2639 - val_accuracy: 0.1765 - val_loss: 2.3256 - learning_rate: 0.0010 Epoch 10/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 99ms/step - accuracy: 0.1368 - loss: 2.3041 Epoch 10: val_loss improved from 2.32509 to 2.31599, saving model to basic_cnn_model_3_augmented.weights.h5 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 107ms/step - accuracy: 0.1375 - loss: 2.3035 - val_accuracy: 0.1765 - val_loss: 2.3160 - learning_rate: 0.0010 Epoch 11/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 99ms/step - accuracy: 0.1779 - loss: 2.2671 Epoch 11: val_loss did not improve from 2.31599 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 106ms/step - accuracy: 0.1770 - loss: 2.2685 - val_accuracy: 0.1765 - val_loss: 2.3210 - learning_rate: 0.0010 Epoch 12/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 98ms/step - accuracy: 0.1468 - loss: 2.2684 Epoch 12: val_loss did not improve from 2.31599 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 104ms/step - accuracy: 0.1468 - loss: 2.2690 - val_accuracy: 0.0784 - val_loss: 2.3220 - learning_rate: 0.0010 Epoch 13/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 98ms/step - accuracy: 0.1435 - loss: 2.2808 Epoch 13: val_loss improved from 2.31599 to 2.31398, saving model to basic_cnn_model_3_augmented.weights.h5 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 107ms/step - accuracy: 0.1441 - loss: 2.2805 - val_accuracy: 0.1373 - val_loss: 2.3140 - learning_rate: 0.0010 Epoch 14/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 99ms/step - accuracy: 0.1720 - loss: 2.2600 Epoch 14: val_loss did not improve from 2.31398 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 106ms/step - accuracy: 0.1709 - loss: 2.2603 - val_accuracy: 0.1373 - val_loss: 2.3240 - learning_rate: 0.0010 Epoch 15/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 101ms/step - accuracy: 0.2041 - loss: 2.2371 Epoch 15: val_loss improved from 2.31398 to 2.30830, saving model to basic_cnn_model_3_augmented.weights.h5 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 109ms/step - accuracy: 0.2012 - loss: 2.2386 - val_accuracy: 0.1176 - val_loss: 2.3083 - learning_rate: 0.0010 Epoch 16/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 103ms/step - accuracy: 0.1543 - loss: 2.2536 Epoch 16: val_loss did not improve from 2.30830 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 110ms/step - accuracy: 0.1543 - loss: 2.2533 - val_accuracy: 0.1176 - val_loss: 2.3179 - learning_rate: 0.0010 Epoch 17/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 145ms/step - accuracy: 0.1325 - loss: 2.2675 Epoch 17: val_loss did not improve from 2.30830 13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 153ms/step - accuracy: 0.1340 - loss: 2.2663 - val_accuracy: 0.1569 - val_loss: 2.3245 - learning_rate: 0.0010 Epoch 18/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 111ms/step - accuracy: 0.1267 - loss: 2.2575 Epoch 18: val_loss did not improve from 2.30830 13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 118ms/step - accuracy: 0.1267 - loss: 2.2568 - val_accuracy: 0.0784 - val_loss: 2.3599 - learning_rate: 0.0010 Epoch 19/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 101ms/step - accuracy: 0.1687 - loss: 2.1768 Epoch 19: val_loss did not improve from 2.30830 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 108ms/step - accuracy: 0.1683 - loss: 2.1799 - val_accuracy: 0.1765 - val_loss: 2.3215 - learning_rate: 0.0010 Epoch 20/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 104ms/step - accuracy: 0.1261 - loss: 2.2535 Epoch 20: val_loss improved from 2.30830 to 2.28959, saving model to basic_cnn_model_3_augmented.weights.h5 13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 113ms/step - accuracy: 0.1282 - loss: 2.2516 - val_accuracy: 0.1373 - val_loss: 2.2896 - learning_rate: 0.0010 Epoch 21/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 99ms/step - accuracy: 0.1866 - loss: 2.2137 Epoch 21: val_loss improved from 2.28959 to 2.28348, saving model to basic_cnn_model_3_augmented.weights.h5 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 106ms/step - accuracy: 0.1849 - loss: 2.2144 - val_accuracy: 0.1569 - val_loss: 2.2835 - learning_rate: 0.0010 Epoch 22/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 104ms/step - accuracy: 0.1842 - loss: 2.1910 Epoch 22: val_loss did not improve from 2.28348 13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 113ms/step - accuracy: 0.1834 - loss: 2.1923 - val_accuracy: 0.1765 - val_loss: 2.4526 - learning_rate: 0.0010 Epoch 23/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 111ms/step - accuracy: 0.1488 - loss: 2.2205 Epoch 23: val_loss did not improve from 2.28348 13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 118ms/step - accuracy: 0.1491 - loss: 2.2199 - val_accuracy: 0.1765 - val_loss: 2.3051 - learning_rate: 0.0010 Epoch 24/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 99ms/step - accuracy: 0.1608 - loss: 2.2162 Epoch 24: val_loss did not improve from 2.28348 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 106ms/step - accuracy: 0.1617 - loss: 2.2150 - val_accuracy: 0.1765 - val_loss: 2.2990 - learning_rate: 0.0010 Epoch 25/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 98ms/step - accuracy: 0.1585 - loss: 2.2131 Epoch 25: val_loss did not improve from 2.28348 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 106ms/step - accuracy: 0.1601 - loss: 2.2114 - val_accuracy: 0.1765 - val_loss: 2.4944 - learning_rate: 0.0010 Epoch 26/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 98ms/step - accuracy: 0.1874 - loss: 2.1786 Epoch 26: val_loss did not improve from 2.28348 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 105ms/step - accuracy: 0.1877 - loss: 2.1786 - val_accuracy: 0.1373 - val_loss: 2.3706 - learning_rate: 0.0010 Epoch 27/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 99ms/step - accuracy: 0.1785 - loss: 2.1749 Epoch 27: val_loss did not improve from 2.28348 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 106ms/step - accuracy: 0.1783 - loss: 2.1748 - val_accuracy: 0.1569 - val_loss: 2.4816 - learning_rate: 0.0010 Epoch 28/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 103ms/step - accuracy: 0.2349 - loss: 2.1180 Epoch 28: val_loss did not improve from 2.28348 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 111ms/step - accuracy: 0.2324 - loss: 2.1211 - val_accuracy: 0.1961 - val_loss: 2.2991 - learning_rate: 0.0010 Epoch 29/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 95ms/step - accuracy: 0.1915 - loss: 2.1684 Epoch 29: val_loss did not improve from 2.28348 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 101ms/step - accuracy: 0.1925 - loss: 2.1677 - val_accuracy: 0.1176 - val_loss: 2.5237 - learning_rate: 0.0010 Epoch 30/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 100ms/step - accuracy: 0.2175 - loss: 2.1183 Epoch 30: val_loss did not improve from 2.28348 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 106ms/step - accuracy: 0.2161 - loss: 2.1197 - val_accuracy: 0.1176 - val_loss: 2.9982 - learning_rate: 0.0010 Epoch 31/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 109ms/step - accuracy: 0.1844 - loss: 2.2093 Epoch 31: ReduceLROnPlateau reducing learning rate to 0.0005000000237487257. Epoch 31: val_loss did not improve from 2.28348 13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 117ms/step - accuracy: 0.1862 - loss: 2.2079 - val_accuracy: 0.1765 - val_loss: 2.4721 - learning_rate: 0.0010 Epoch 32/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 98ms/step - accuracy: 0.2430 - loss: 2.1173 Epoch 32: val_loss did not improve from 2.28348 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 104ms/step - accuracy: 0.2415 - loss: 2.1193 - val_accuracy: 0.2157 - val_loss: 2.4675 - learning_rate: 5.0000e-04 Epoch 33/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 99ms/step - accuracy: 0.2376 - loss: 2.1319 Epoch 33: val_loss improved from 2.28348 to 2.28116, saving model to basic_cnn_model_3_augmented.weights.h5 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 107ms/step - accuracy: 0.2377 - loss: 2.1321 - val_accuracy: 0.1569 - val_loss: 2.2812 - learning_rate: 5.0000e-04 Epoch 34/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 98ms/step - accuracy: 0.1982 - loss: 2.1544 Epoch 34: val_loss improved from 2.28116 to 2.27935, saving model to basic_cnn_model_3_augmented.weights.h5 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 106ms/step - accuracy: 0.1987 - loss: 2.1532 - val_accuracy: 0.1373 - val_loss: 2.2794 - learning_rate: 5.0000e-04 Epoch 35/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 95ms/step - accuracy: 0.2158 - loss: 2.0916 Epoch 35: val_loss did not improve from 2.27935 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 102ms/step - accuracy: 0.2158 - loss: 2.0919 - val_accuracy: 0.2157 - val_loss: 2.6016 - learning_rate: 5.0000e-04 Epoch 36/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 97ms/step - accuracy: 0.2390 - loss: 2.0685 Epoch 36: val_loss did not improve from 2.27935 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 103ms/step - accuracy: 0.2387 - loss: 2.0703 - val_accuracy: 0.1765 - val_loss: 2.8312 - learning_rate: 5.0000e-04 Epoch 37/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 97ms/step - accuracy: 0.2427 - loss: 2.0642 Epoch 37: val_loss did not improve from 2.27935 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 105ms/step - accuracy: 0.2408 - loss: 2.0670 - val_accuracy: 0.1373 - val_loss: 2.5870 - learning_rate: 5.0000e-04 Epoch 38/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 97ms/step - accuracy: 0.2857 - loss: 1.9970 Epoch 38: val_loss did not improve from 2.27935 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 104ms/step - accuracy: 0.2838 - loss: 2.0013 - val_accuracy: 0.1961 - val_loss: 2.3187 - learning_rate: 5.0000e-04 Epoch 39/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 97ms/step - accuracy: 0.2769 - loss: 2.0111 Epoch 39: val_loss did not improve from 2.27935 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 104ms/step - accuracy: 0.2756 - loss: 2.0149 - val_accuracy: 0.2157 - val_loss: 2.3380 - learning_rate: 5.0000e-04 Epoch 40/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 97ms/step - accuracy: 0.2782 - loss: 2.0538 Epoch 40: val_loss did not improve from 2.27935 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 105ms/step - accuracy: 0.2771 - loss: 2.0543 - val_accuracy: 0.2353 - val_loss: 2.7615 - learning_rate: 5.0000e-04 Epoch 41/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 96ms/step - accuracy: 0.2716 - loss: 2.0798 Epoch 41: val_loss did not improve from 2.27935 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 103ms/step - accuracy: 0.2709 - loss: 2.0784 - val_accuracy: 0.1765 - val_loss: 2.5901 - learning_rate: 5.0000e-04 Epoch 42/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 98ms/step - accuracy: 0.2259 - loss: 2.0563 Epoch 42: val_loss did not improve from 2.27935 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 104ms/step - accuracy: 0.2270 - loss: 2.0548 - val_accuracy: 0.2745 - val_loss: 2.5531 - learning_rate: 5.0000e-04 Epoch 43/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 95ms/step - accuracy: 0.2865 - loss: 2.0169 Epoch 43: val_loss did not improve from 2.27935 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 102ms/step - accuracy: 0.2856 - loss: 2.0175 - val_accuracy: 0.1961 - val_loss: 2.5566 - learning_rate: 5.0000e-04 Epoch 44/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 95ms/step - accuracy: 0.2587 - loss: 2.0164 Epoch 44: ReduceLROnPlateau reducing learning rate to 0.0002500000118743628. Epoch 44: val_loss did not improve from 2.27935 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 102ms/step - accuracy: 0.2582 - loss: 2.0183 - val_accuracy: 0.1961 - val_loss: 2.4996 - learning_rate: 5.0000e-04 Epoch 45/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 95ms/step - accuracy: 0.2501 - loss: 2.0373 Epoch 45: val_loss did not improve from 2.27935 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 102ms/step - accuracy: 0.2535 - loss: 2.0328 - val_accuracy: 0.2353 - val_loss: 2.4919 - learning_rate: 2.5000e-04 Epoch 46/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 97ms/step - accuracy: 0.2775 - loss: 1.9373 Epoch 46: val_loss did not improve from 2.27935 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 104ms/step - accuracy: 0.2767 - loss: 1.9416 - val_accuracy: 0.2353 - val_loss: 2.4057 - learning_rate: 2.5000e-04 Epoch 47/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 95ms/step - accuracy: 0.2626 - loss: 2.0460 Epoch 47: val_loss did not improve from 2.27935 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 101ms/step - accuracy: 0.2640 - loss: 2.0436 - val_accuracy: 0.2353 - val_loss: 2.6130 - learning_rate: 2.5000e-04 Epoch 48/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 99ms/step - accuracy: 0.2824 - loss: 1.9980 Epoch 48: val_loss did not improve from 2.27935 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 105ms/step - accuracy: 0.2824 - loss: 1.9959 - val_accuracy: 0.2353 - val_loss: 2.4668 - learning_rate: 2.5000e-04 Epoch 49/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 109ms/step - accuracy: 0.2603 - loss: 1.9838 Epoch 49: val_loss did not improve from 2.27935 13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 118ms/step - accuracy: 0.2612 - loss: 1.9826 - val_accuracy: 0.2549 - val_loss: 2.3397 - learning_rate: 2.5000e-04 Epoch 50/50 13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 97ms/step - accuracy: 0.3040 - loss: 1.9461 Epoch 50: val_loss did not improve from 2.27935 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 104ms/step - accuracy: 0.3028 - loss: 1.9465 - val_accuracy: 0.2941 - val_loss: 2.6083 - learning_rate: 2.5000e-04
test_loss, test_acc = basic_cnn_model_3.evaluate(test_generator)
train_loss, train_acc = basic_cnn_model_3.evaluate(train_generator)
print(f"Test Accuracy: {test_acc*100:.2f}%")
print(f"Train Accuracy: {train_acc*100:.2f}%")
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 39ms/step - accuracy: 0.2292 - loss: 2.5131 13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 79ms/step - accuracy: 0.3284 - loss: 1.8571 Test Accuracy: 25.00% Train Accuracy: 30.49%
def plot_training_history_generator(history, model, test_generator, model_name="Model"):
import matplotlib.pyplot as plt
# Plot
plt.figure(figsize=(12, 5))
# Accuracy
plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'], label='Train Accuracy')
plt.plot(history.history['val_accuracy'], label='Val Accuracy')
plt.title(f'{model_name} - Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.grid(True)
# Loss
plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='Train Loss')
plt.plot(history.history['val_loss'], label='Val Loss')
plt.title(f'{model_name} - Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()
# Print last epoch stats
print(f"\n📊 Final Training Accuracy: {history.history['accuracy'][-1]:.4f}")
print(f"📊 Final Validation Accuracy: {history.history['val_accuracy'][-1]:.4f}")
# Evaluate on test set
test_loss, test_acc = model.evaluate(test_generator, verbose=0)
print(f"\n🧪 Test Accuracy: {test_acc:.4f}")
print(f"🧪 Test Loss : {test_loss:.4f}")
plot_training_history_generator(basic_cnn_model_3_history, basic_cnn_model_3, test_generator=test_generator, model_name="Basic CNN 3")
📊 Final Training Accuracy: 0.2878 📊 Final Validation Accuracy: 0.2941 🧪 Test Accuracy: 0.2500 🧪 Test Loss : 2.4765
from sklearn.metrics import classification_report, confusion_matrix
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
def evaluate_model_predictions(model, test_generator, class_names):
# Step 1: Predict
predictions = model.predict(test_generator)
predicted_classes = np.argmax(predictions, axis=1)
true_classes = test_generator.classes
# Step 2: Classification Report
print("Classification Report:")
print(classification_report(true_classes, predicted_classes, target_names=class_names,))
# Step 3: Confusion Matrix
cm = confusion_matrix(true_classes, predicted_classes)
plt.figure(figsize=(12, 8))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', xticklabels=class_names, yticklabels=class_names)
plt.xlabel("Predicted")
plt.ylabel("True")
plt.title("Confusion Matrix")
plt.show()
evaluate_model_predictions(basic_cnn_model_3, test_generator, class_names)
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 50ms/step Classification Report: precision recall f1-score support Apple Pie 0.00 0.00 0.00 5 Chocolate 0.25 0.80 0.38 5 French Fries 0.33 0.20 0.25 5 Hotdog 0.00 0.00 0.00 5 Nachos 0.00 0.00 0.00 5 Pizza 0.33 0.60 0.43 5 onion_rings 0.00 0.00 0.00 5 pancakes 0.25 0.33 0.29 6 spring_rolls 0.00 0.00 0.00 6 tacos 0.23 0.60 0.33 5 accuracy 0.25 52 macro avg 0.14 0.25 0.17 52 weighted avg 0.14 0.25 0.17 52
import random
import matplotlib.pyplot as plt
import numpy as np
def plot_random_predictions_generator(test_generator, class_names, model, num_samples=5):
"""
Plots random samples from one batch of the test_generator with predicted and actual labels.
Correct predictions are shown in green, incorrect in red.
Args:
test_generator (DirectoryIterator or DataFrameIterator): Keras test data generator
class_names (list): List of class names corresponding to label indices
model (keras.Model): Trained classification model
num_samples (int): Number of random samples to display (default: 5)
"""
# Get one batch of data
images, labels = next(test_generator)
# Limit num_samples to batch size
num_samples = min(num_samples, images.shape[0])
indices = random.sample(range(images.shape[0]), num_samples)
cols = 5
rows = (num_samples + cols - 1) // cols
plt.figure(figsize=(cols * 3, rows * 3))
for i, idx in enumerate(indices):
img = images[idx]
true_label = np.argmax(labels[idx])
pred_probs = model.predict(np.expand_dims(img, axis=0), verbose=0)
pred_label = np.argmax(pred_probs)
color = 'green' if pred_label == true_label else 'red'
title_text = f"Pred: {class_names[pred_label]}\nActual: {class_names[true_label]}"
plt.subplot(rows, cols, i + 1)
plt.imshow(img)
plt.title(title_text, color=color, fontsize=10)
plt.axis('off')
plt.suptitle("Model Predictions on Test Generator Batch", fontsize=16)
plt.tight_layout()
plt.subplots_adjust(top=0.85)
plt.show()
plot_random_predictions_generator(test_generator, class_names, basic_cnn_model_3, num_samples=20)
CNN Model Comparison Report¶
Overview¶
| Aspect | Model 1 | Model 2 | Model 3 |
|---|---|---|---|
| Architecture | Basic CNN (3 conv layers) | Deep CNN with 5 conv blocks | Deep CNN + Data Augmentation |
| Overfitting | Yes – severe | No – well-regularized | No – regularized and generalized better |
| Regularization | Dropout only | Dropout + BatchNorm | Dropout + BatchNorm + Data Augmentation |
| Learning Curve | Early overfit | Steady slow learning | Gradual, steady learning with improved accuracy |
Detailed Observations¶
1. Underfitting in Model 1 & 2, Better Learning in Model 3¶
- Models 1 and 2 showed low training/validation accuracy, indicating underfitting.
- Model 3 showed consistent improvement (e.g., ~22.5% validation accuracy by epoch 6).
- Data augmentation helped learn more generalized features.
2. Low Precision, Recall, and F1-Scores¶
- Most classes had low recall and F1-scores, especially in Models 1 & 2.
- Some improvement in Model 3 for certain distinctive classes (e.g.,
spring_rolls,chocolate).
3. Improvements from Augmentation¶
- Model 3 used
ImageDataGeneratorto apply:- Rotation, zoom, flips, brightness adjustments, etc.
- Helped mitigate overfitting and improved learning.
4. Learning Stability¶
- Model 3 showed gradual and stable learning (no sharp spikes).
- Training and validation loss decreased in sync.
Model Architecture Feedback¶
Common Weaknesses (Models 1 & 2):¶
- Shallow CNNs with limited filters.
- Use of
Flatten()increased overfitting risk. - Lack of complex feature extractors or residual connections.
Model 3 Improvements:¶
- Introduced BatchNormalization, Dropout, and Augmentation.
- More stable validation metrics, though still modest performance.
Recommendations¶
Model Enhancements:¶
- Add more
Conv2Dblocks with higher filter sizes. - Use BatchNormalization and Dropout after every block.
- Replace
Flatten()withGlobalAveragePooling2D.
Data Handling:¶
- Keep aggressive
ImageDataGeneratorusage. - Use class weights or oversampling to address class imbalance.
Training Strategy:¶
- Train for 50–100 epochs.
- Include:
EarlyStoppingReduceLROnPlateau- Possibly learning rate warm-up or scheduling.
Upgrade to Transfer Learning:¶
- Use a pretrained model like:
MobileNetV2,EfficientNetB0, orResNet50.
- Freeze base layers and fine-tune last few layers.
- Ideal for small datasets and many classes.
Conclusion¶
- Model 3 showed clear improvements over Models 1 & 2.
- Still limited by:
- Modest architecture
- Small, imbalanced dataset
Next Steps:
- Shift to transfer learning (Use a pretrained model like:
MobileNetV2,EfficientNetB0, orResNet50. - Improve data diversity
- Scale training efforts
Transfer learning + augmentation is the most effective path forward.